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Abstract — Estimation of a vector from quantized linear mea- 
surements is a common problem for which simple linear tech- 
niques are suboptimal — sometimes greatly so. This paper de- 
velops generalized approximate message passing (GAMP) algo- 
rithms for minimum mean-squared error estimation of a random 
vector from quantized linear measurements, notably allowing the 
linear expansion to be overcomplete or undercomplete and the 
scalar quantization to be regular or non-regular. GAMP is a 
recently-developed class of algorithms that uses Gaussian approx- 
imations in belief propagation and allows arbitrary separable 
input and output channels. Scalar quantization of measurements 
is incorporated into the output channel formalism, leading to 
the first tractable and effective method for high-dimensional 
estimation problems involving non-regular scalar quantization. 
Non-regular quantization is empirically demonstrated to greatly 
improve rate-distortion performance in some problems with 
oversampling or with undersampling combined with a sparsity- 
inducing prior. Under the assumption of a Gaussian measurement 
matrix with i.i.d. entries, the asymptotic error performance of 
GAMP can be accurately predicted and tracked through the state 
evolution formalism. We additionally use state evolution to design 
MSE-optimal scalar quantizers for GAMP signal reconstruction 
and empirically demonstrate the superior error performance of 
the resulting quantizers. 

Index Terms — analog-to-digital conversion, approximate mes- 
sage passing, belief propagation, compressed sensing, frames, 
non-regular quantizers, Slepian-Wolf coding, quantization, 
Wyner-Ziv coding 

I. Introduction 

ESTIMATION of a signal from quantized samples is a 
fundamental problem in signal processing. It arises both 
from the discretization in digital acquisition devices and the 
quantization performed for lossy compression. 

This paper considers of estimation of an i.i.d. vector x 
from quantized transformed samples of the form Q(z) where 
z = Ax is a linear transform of x and Q( ) is a scalar 
(componentwise separable) quantization operator Due to the 
transform A, the components of z may be correlated. Even 
though the traditional transform coding paradigm demonstrates 
the advantages of expressing the signal with independent 
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components prior to coding fT\, quantization of vectors with 
correlated components nevertheless arises in a range of cir- 
cumstances. For example, to model oversampled analog-to- 
digital conversion (ADC), we may write a vector of time- 
domain samples as z = Ax, where the entries of the vector 
X are statistically independent Fourier components and A 
is an oversampled inverse discrete Fourier transform. The 
oversampled ADC quantizes the correlated time-domain sam- 
ples z, as opposed to the Fourier coefficients x. Distributed 
sensing also necessitates quantization of components that are 
not independent since decorrelating transforms may not be 
possible prior to the quantization. More recently, compressed 
sensing has become a motivation to consider quantization of 
randomly linearly mixed information, and several sophisticated 
reconstruction approaches have been proposed (|2]-(3]. 

Estimation of a vector x from quantized samples of the 
form Q(Ax) is challenging because the quantization function 
Q(-) is nonlinear and the transform A couples, or "mixes," the 
components of x, thus necessitating joint estimation. Although 
reconstruction from quantized samples is typically linear, 
more sophisticated, nonlinear techniques can offer significant 
improvements in the case of quantized transformed data. A 
key example ADC, where the improvement from replacing 
conventional linear estimation with nonlinear estimation in- 
creases with the oversampling factor ['5l-[T3l. 

This paper focuses on using a simple message-passing 
algorithm based on belief propagation (BP). Implementation 
of BP for estimation of a continuous-valued quantity requires 
discretization of densities; this is inherently inexact and leads 
to high computational complexity. To handle quantization 
effects without any heuristic additive noise model II 4i and 
with low complexity, we use a recently-developed Gaussian- 
approximated BP algorithm, called generalized approximate 
message passing (GAMP) |15 | or relaxed belief propaga- 
tion 1 16 1, which extends earUer methods ifTTI . ifTSl to nonlinear 
output channels. 

A. Contributions 

Gaussian approximations of loopy BP have previously been 
shown to be effective in several other applications lfT6) - ll2T| : 
for our application to estimation from quantized samples, the 
extension to general output channels [ISJ, 1 16| is essential. 
Using this extension to nonlinear output channels, we show 
that GAMP-based estimation offer several key benefits: 
• General quantizers: The GAMP algorithm permits es- 
sentially arbitrary quantization functions Q( ) includ- 
ing non-uniform and even non-regular quantizers (i.e. 
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quantizers with cells composed of unions of disjoint 
intervals) used, for example, in Wyner-Ziv coding [22] 
and multiple description coding ||23l . In Section IVIIII 
we will demonstrate that a non-regular modulo quantizer 
can provide performance improvements for correlated 
data. We believe that the GAMP algorithm provides the 
first tractable estimation method that can exploit such 
quantizers. 

• General priors: GAMP-based estimation can incorporate 
a large class of priors on the components of x, provided 
that the components are independent. For example, in 
Section IVIIII we will demonstrate the algorithm on re- 
covery of vectors with sparse priors arising in quantized 
compressed sensing 0-131. 

• Exact characterization with random transforms: In the 
case of certain large random transforms A, the compo- 
nentwise performance of GAMP-based estimation can be 
precisely predicted by a so-called state evolution (SE) 
analysis reviewed in Section |VI] From the SE analysis, 
one can precisely evaluate any componentwise perfor- 
mance metric, including for example, mean-squared error 
(MSE). In contrast, works such as ll5l- lfT3l mentioned 
above have only obtained bounds or scaling laws. 

• Performance and optimality: Our simulations indi- 
cate significantly-improved performance over traditional 
methods for estimating from quantized samples in a range 
of scenarios. Moreover, for certain large random sparse 
transforms, the SE analysis provides testable conditions 
under which the GAMP reconstruction is provably opti- 
mal IBi. 

• Computational simplicity: The GAMP algorithm is com- 
putationally extremely fast. Our simulation and SE anal- 
ysis indicate good performance with a small number 
of iterations (10 to 20 in our experience), with the 
dominant computational cost per iteration simply being 
multiplication by A and A-^. 

• Applications to optimal quantizer design: When quantizer 
outputs are used as inputs to a nonlinear estimation 
algorithm, minimizing the MSE between quantizer inputs 
and outputs is generally not equivalent to minimizing 
the MSE of the final reconstruction f24|. To optimize 
the quantizer for the GAMP algorithm, we use the fact 
that the MSE under large random mixing matrices A 
can be predicted accurately from a set of simple SE 
equations ITSl . ||25]| . Then, by modeling the quantizer 
as a part of the measurement channel, we use the SE 
formalism to optimize the quantizer to minimize the 
asymptotic distortion after the reconstruction by GAMP. 
Note that our use of random A is for rigor of the SE 
formalism; the effectiveness of GAMP does not depend 
on this. 

B. Outline 

The remainder of the paper is organized as follows. Sec- 
tion In] provides basic background material on quantization, 
compressed sensing, and belief propagation. Section |lll] in- 
troduces the problem of estimating a random vector from 



quantized linear transform coefficients. It concentrates on 
geometric insights for both the oversampled and undersampled 
settings. The main results in this paper apply under a Bayesian 
formulation introduced in Section |IV] Note that this Bayesian 
formulation does not require sparsity of the signal nor spec- 
ify undersampling or oversampling. The use of generalized 
approximate message passing to find optimal estimates under 
this Bayesian formulation is derived in Section |V] Section IVll 
describes the use of SE to predict the performance of GAMP 
for our problem. Optimization of quantizers using SE is de- 
veloped in Section [VIll and experimental results are presented 
in Section IVIIII Section |IX] concludes the paper. 

C. Notation 

Vectors and matrices will be written in boldface type (A, 
X, y, . . . ) to distinguish from scalars written in normal 
weight (m, n, . . . ). Random and non-random quantities (or 
random variables and their realizations) are not distinguished 
typographically since the use of capital letters for random 
variables would conflict with the convention of using capital 
letters for matrices (or in the case of quantization, an operator 
on a vector rather than a scalar). The probability density 
function (p.d.f.) of random vector x is denoted px, and the 
conditional p.d.f. of y given x is denoted Py|x- When these 
densities are separable and identical across components, we 
repeat the previous notations: px for the scalar p.d.f. and py|x 
for the scalar conditional p.d.f. Writing x ^ Af{a, b) indicates 
that a; is a Gaussian random variable with mean a and variance 
b. The resulting p.d.f. is written as Px{t) — (t>{t ; a, b). 

II. Background 

This section establishes concepts and notations central to 
the paper For a comprehensive tutorial history of quantiza- 
tion, we recommend [26l; for an introduction to compressed 
sensing, |27|; and for the basics of belief propagation, [28 J- 
[30|. 

A. Scalar Quantization 

A K-\&ve\ scalar quantizer q : M — > M is defined by its out- 
put levels or reproduction points C = {ci}f£i and (partition) 
cells {q~^{ci)}fLi. It can be decomposed into a composition 
of two mappings q = f3 o a where a : M — > {1, 2, . . . , K} 
is the (lossy) encoder and f3 : {1, 2, . . . , K} — C is 
the decoder. The boundaries of the cells are called decision 
thresholds. One may allow /-C = oo to denote that C is 
countably infinite. 

A quantizer is called regular when each cell is a convex set, 
i.e., a single interval. Each cell of a regular scalar quantizer 
thus has a boundary of one point (if the cell is unbounded) or 
two points (if the cell is bounded). If the input to a quantizer 
is a continuous random variable, then the probability of the 
input being a boundary point is zero. Thus it suffices to specify 
the cells of a X-point regular scalar quantizer by its decision 
thresholds {fei}f£g, with 5o = —oo and bx — oo. The encoder 
satisfies 

a{x) — i for x G 
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and the output for boundary points can be safely ignored. 

The lossy encoder of a non-regular quantizer can be decom- 
posed into the lossy encoder of a regular quantizer followed 
by a many-to-one integer-to-integer mapping. Suppose K- 
level non-regular scalar quantizer q' has decision thresholds 
{^i}i£o' 1^'- ^ lossy encoder of a regular quan- 

tizer with these decision thresholds. Since q' is not regular, 
K' > K. Let a' : M -> {1, 2, . . . , K} denote the lossy 
encoder of q' . Then a' = \ o a, where 

A:{1, 2, 2, 

is called a binning function, labeling function, or index assign- 
ment. The binning function is not invertible. 

The distortion of a quantizer q applied to scalar random 
variable x is typically measured by the MSB 

D = ¥.[{x-q{x)f]. 

A quantizer is called optimal at fixed rate R = logj K when 
it minimizes distortion D among all iiT-level quantizers. To 
optimize scalar quantizers under MSE distortion, it suffices to 
consider only regular quantizers; a non-regular quantizer will 
never perform strictly better 

While regular quantizers are optimal for the standard lossy 
compression problem, non-regular quantizers are sometimes 
useful when some information aside from q{x) is available 
when estimating x. Two key examples are Wyner-Ziv cod- 
ing ll22i and multiple description coding f23\. One method 
for Wyner-Ziv coding is to apply Slepian-Wolf coding across 
a block of samples after regular scalar quantization [31]; the 
Slepian-Wolf coding is binning, but across a block rather 
than for a single scalar In multiple description scalar quan- 
tization 1 32 1, two binning functions are used that together 
are invertible but individually are not. In these uses of non- 
regular quantizers, side information aids in recovering x with 
resolution commensurate with K' while the rate is only 
commensurate with K, with K' > K. 

Optimization of a quantizer can rarely be done exactly 
or analytically. One standard way of optimizing q is via 
the Lloyd algorithm, which iteratively updates the decision 
boundaries and output levels by applying necessary conditions 
for quantizer optimality. 

A quantizer Q : M™ R™ is called a scalar quantizer 
when it is the Cartesian product of m scalar quantizers qi : 
M — > R. In this paper, Q always represents a scalar quantizer 
with component quantizers {gi}™ 

B. Compressed Sensing 

Conventionally, one does not attempt to estimate an n- 
dimensional signal x from fewer than n scalar quantities; it 
would not seem to work from a simple counting of degrees 
of freedom. Compressed sensing (CS) Ii33l - ll35l encapsulates 
a variety of techniques for estimating x from m < n 
scalar linear measurements, possibly including some noise, by 
exploiting knowledge that x is sparse or approximately sparse 
in some given transform domain. Measurements are of the 
form 

z = Ax, (1) 



where A £ is the measurement matrix, or 

y = z + d = Ax + d, (2) 

where d e R™ is additive noise. Many theoretical guarantees 
for compressed sensing are given with high probability of 
success over a random selection of A. Note that it is always 
assumed that A is available when estimating x from z or y. 

In this paper, we simplify notation and expressions by 
assuming that x itself is sparse or approximately sparse 
without requiring the use of a transform domain. Also, since 
our focus is on estimation in the presence of degradation of 
measurements caused by quantization, we do not consider 
further the noiseless measurement model dl}. 

The most commonly-studied estimator for the measurement 
model dill is the lasso estimator |36| 

X = argmin (i||y - Ax||2 + 7||x|| i) , 

xSR" 

where algorithm parameter 7 > trades off data fidelity 
against sparsity of the solution. This may be interpreted as 
a Lagrangian form of the estimator 

X = argmin ||x||i, 

X : ||y— Ax||2<e 

which could be justified heuristically by ||d||2 < e. 

Most of the CS literature has considered signal recovery 
with no noise or with ||d||2 < e. However, in many practical 
applications, measurements have to be discretized to a finite 
number of bits. The effect of such quantization on the per- 
formance of CS reconstruction has been studied in ll37l . |[38l . 
In 1391 . high-resolution functional scalar quantization theory 
was used to design quantizers for lasso estimation. Better yet is 
to change the reconstruction algorithm: In l|2]-[|4j, the authors 
demonstrate that when d represents quantization error, 

d = Q(Ax) - Ax, 

significant improvements can be obtained by replacing the 
constraint ||y — Axjjj < e by one that uses the partition cells 
of the quantizers that compose Q. 

While convex optimization formulations are prominent in 
CS, estimation with generic convex program solvers often has 
excessively high computational cost. Thus, there is significant 
interest in greedy and iterative methods. The use of belief 
propagation for CS estimation was first proposed in |l40l ; 
however, as explained in Section III-CI belief propagation has 
high complexity for the estimation of continuous-valued quan- 
tities. Lower-complexity approximations to belief propagation 
were first proposed for CS estimation in [21]. To handle the 
effects of quantization precisely, in this paper we use the 
generalization of the technique of ilTSl . 11211 developed by 
Rangan ITSl . 

C. Belief Propagation 

Consider the problem of estimating a random vector x e R" 
from noisy measurements y e R™, where the noise is 
described by a measurement channel pyi^ that acts separably 
and identically on each entry of the vector z obtained via ([U. 
Moreover suppose that elements in the vector x are distributed 
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i.i.d. according to px- We can construct the following condi- 
tional probability distribution over random vector x given the 
measurements y: 

_^ n m 

Px|y(x I y) = -^Ylp^ixj) Y[py\^ (y» I Z^) , (3) 

j=l 1=1 

where Z is the normalization constant and Zi — (Ax);. In 
principle, it is possible to estimate each Xj by marginalizing 
this distribution. 

Belief propagation replaces the computationally intractable 
direct marginalization of px|y with an iterative algorithm. To 
apply BP, construct a bipartite factor graph G — (V, F, E) 
from ((3) and pass the following messages along the edges E 
of the graph: 

« P^{xj)Wp\^j{xj), (4a) 

I^Ujixj) cx pyi^{y^\ z,)Y[fil^^{x.j)dx\j, (Ah) 

where cx means that the distribution is to be normalized so 
that it has unit integral and integration is over all the elements 
of X except Xj. We refer to messages {p,i^j}(^i j-j^E as vari- 
able updates and to messages {p,i^j}^i j)^E as measurement 
updates. BP is initialized by setting P^i^j{xj) = p^{xj). 

Earlier works on BP reconstruction have shown that it 
is asymptotically MSE optimal under certain verifiable con- 
ditions. These conditions involve simple single-dimensional 
recursive equations called state evolution (SB), which predicts 
that BP is optimal when the corresponding SB admits a unique 
fixed point |17|, |25|. However, direct implementation of 
BP is impractical due to the dense structure of A, which 
implies that the algorithm must compute the marginal of 
a high-dimensional distribution at each measurement node; 
i.e., the integration in ( l4b] i is over many variables. Burther- 
more, integration must be approximated through some discrete 
quadrature rule. 

BP can be simplified through various Gaussian approxima- 
tions, including the relaxed BP method llT6l . IITtI and approx- 
imate message passing (AMP) f\5\, fT\\. Recent theoretical 
work and extensive numerical experiments have demonstrated 
that, in the case of certain large random measurement matrices, 
the error performance of both relaxed BP and AMP can also 
be accurately predicted by SB. 

III. Quantized Linear Bxpansions 

This paper focuses on the general quantized measurement 
abstraction of 

y = Q(Ax), (5) 

where x G R" is a signal of interest, A e jgmx" a linear 
mixing matrix, and Q : R™ — >■ R™ is a scalar quantizer. 
We will be primarily interested in (per-component) MSE 
ri~^E[||x— x|p] for various estimators x that depend on y. A, 
and Q. The cases of m > n and m < n are both of interest. 
We sometimes use z ~ Ax to simplify expressions. 



A. Overcomplete Expansions 

Let A G Rmx" have rank n. Then {a^}™]^ is a. frame in 
R", where af is row i of A. Rank n can occur only with 
m > n, so Ax is called an overcomplete expansion of x and 
Q(Ax) as in (|5]l is called a quantized overcomplete expansion. 
In some cases of interest, the frame may be uniform, meaning 
||ai|| = 1 for each i, or tight, meaning A^A = cl„ for some 
scalar c. 

Commonly-used linear reconstruction forms estimate 

X- Aty = AtQ(Ax), (6) 

where A^^ — (A-^A)^^A-^ is the pseudoinverse of A. Under 
several reasonable models, linear reconstruction has MSB 
inversely proportional to m. Bor example, suppose the frame is 
uniform and tight and x is an unknown deterministic quantity. 
By modeling scalar quantization yi — qi{zi) with an additive 
noise as 

yi = Zi+ di (7a) 

where 

E[d,] = 0, (7b) 
E[didj] = o-j^ij, (7c) 

one can compute the MSB to be na^/m PTl . 

Bven when the model (|7]l is accurate ll42l . the linear 
reconstruction (|6]l may be far from optimal. More sophisti- 
cated algorithms have focused on enforcing consistency of an 
estimate with the quantized samples. A nonlinear estimate may 
exploit the boundedness of the sets 

S^{y,) = {x e R" I q,{z,) = yj, i ^ I, 2, . . . , m, 

which we call single-sample consistent sets. Assuming for now 
that scalar quantizer qi is regular and its cells are bounded, the 
boundary of Si{yi) is two parallel hyperplanes. The full set of 
hyperplanes obtained for one index i by varying yi over the 
output levels of qi is called a hyperplane wave partition |^3l, 
as illustrated for a uniform quantizer in Bigure [T}a). The set 
enclosed by two neighboring hyperplanes in a hyperplane 
wave partition is called a slab; one slab is shaded in Big- 
ure [Ha). Intersecting Si{yi) for n distinct indexes specifies an 
n-dimensional parallelotope as illustrated in BiguredJb). Using 
more than n of these single-sample consistent sets restricts x 
to a finer partition, as illustrated in Bigure [Tic) for m = 3. 
The intersection 

m 

S{y) ^ [\S,{y,) 

i=l 

is called the consistent set. Since each Si{yi) is convex, 
one may reach S{y) asymptotically through a sequence of 
projections onto Si{yi) using each infinitely often IS), |I6|. 

In a variety of settings, nonlinear estimates achieve MSB 
inversely proportional to m?, which is the best possible 
dependence on m ||431 . The first result of this sort was 
in lis). When A is an oversampled discrete Bourier transform 
matrix and Q is a uniform quantizer, z = Ax represents 
uniformly quantized samples above Nyquist rate of a periodic 
bandlimited signal. Bor this case, it was proven in [|3 that 
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(a) (b) (C) 



Fig. 1: Visualizing the information present in a quantized overcomplete expansion of x G when each qi is a regular 
quantizer (a) A single hyperplane wave partition with one single-sample consistent set shaded, (b) Partition boundaries from 
two hyperplane waves; x is specified to the intersection of two single-sample consistent sets, which is a bounded convex cell, 
(c) Partition from part (b) in dashed hnes with a third hyperplane wave added in solid lines. 



any x G S{y) has 0{m~'^) MSB, under a mild assumption 
on ||x||. This was extended empirically to arbitrary uniform 
frames in 171, where it was also shown that consistent estimates 
can be computed through a linear program. The techniques 
of alternating projections and linear programming suffer from 
high computational complexity; yet, since they generally find 
a corner of the consistent set (rather than the centroid), the 
MSB performance is suboptimal. 

Full consistency is not necessary for optimal MSB de- 
pendence on m. It was shown in \%\ that 0{m~^) MSB 
is guaranteed for a simple algorithm that uses each Si{yi) 
only once, recursively, under mild conditions on randomized 
selection of {a^}™ j^. These results were strengthened and 
extended to deterministic frames in [il3i . 

Quantized overcomplete expansions arise naturally in ac- 
quisition subsystems such as ADCs, where m/n represents 
oversampling factor relative to Nyquist rate. In such systems, 
high oversampling factor may be motivated by a trade-off 
between MSB and power consumption or manufacturing cost: 
within certain bounds, faster sampling is cheaper than a higher 
number of quantization bits per sample P4l . However, high 
oversampling does not give a good trade-off between MSB and 
raw number of bits produced by the acquisition system: com- 
bining the proportionality of bit rate R to number of samples 
TO with the best-case 9(to~^) MSB, we obtain Q{R~'^) MSB; 
this is poor compared to the exponential decrease of MSB with 
R obtained with scalar quantization of Nyquist-rate samples. 

Ordinarily, the bit-rate inefficiency of the raw output is 
made irrelevant by recoding, at or near Nyquist rate, soon 
after acquisition or within the ADC. An alternative explored 
in this paper is to combat this bit-rate inefficiency through the 
use of non-regular quantization. 

B. Non-Regular Quantization 

The bit-rate inefficiency of the raw output with regular 
quantization is easily understood with reference to Figure [Tfc). 
After yi and 7/2 are fixed, x is known to lie in the intersection 
of the shaded strips. Only four values of 7/3 are possible (i.e.. 



the solid hyperplane wave breaks Si (l)n52 (0) into four cells), 
and bits are wasted if this is not exploited in the representation 
of 2/3- 

Recall the discussion of generating a non-regular quantizer 
by using a binning function A in Section Hl-AI Binning does 
not change the boundaries of the single-sample consistent sets, 
but it makes these sets unions of slabs that may not even 
be connected. Thus, while binning reduces the quantization 
rate, in the absence of side information that specifies which 
slab contains x (at least with moderately high probability), 
it increases distortion significantly. The increase in distortion 
is due to ambiguity among slabs. Taking rn > n quantized 
samples together may provide adequate information to disam- 
biguate among slabs, thus removing the distortion penalty. 

The key concepts in the use of non-regular quantization 
are illustrated in Figure |2] Suppose one quantized sample 
yi specifies a single-sample consistent set composed 
of two slabs, such as the shaded region in Figure |2ja). A 
second quantized sample 7/2 will not disambiguate between 
the two slabs. In the example shown in Figure |2b), 52(2/2) is 
composed of two slabs, and Si{yi) n 52(2/2) is the union of 
four connected sets. A third quantized sample 2/3 may now 
completely disambiguate; the particular example of 53(2/3) 
shown in Figure |2c) makes S — Si{yi) 052(2/2) 053(2/3) a 
single convex set. 

When the quantized samples together completely disam- 
biguate the slabs as in the example, the rate reduction from 
binning comes with no increase in distortion. The price to pay 
comes in complexity of estimation. 

The use of binned quantization of linear expansions was 
introduced in |45|, where the only reconstruction method 
proposed is intractable in high dimensions because it is 
combinatorial over the binning functions. Specifically, using 
the notation from Section III-AI let the quantizer forming yi 
be defined by (a^, (3i,Xi). Then ^^^{(^^^{yi)) will be a set of 
possible values of ai{zi) specified by yi. One can try every 
combination, i.e., element of 

Ar'(/3r'(j/i)) X A2-i(/32-i(2/2)) X ... X A-i(/3-i(j/,„)), (8) 
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(a) (b) (C) 



Fig. 2: Visualizing the information present in a quantized overcomplete expansion of x € when using non-regular (binned) 
quantizers, (a) A single hyperplane wave partition with one single-sample consistent set shaded. Note that binning makes 
the shaded set not connected, (b) Partition boundaries from two hyperplane waves; x is specified to the intersection of two 
single-sample consistent sets, which is now the union of four convex cells, (c) A third sample now specifies x to within a 
consistent set S that is convex. 



to seek a consistent estimate. If the binning is effective, most 
combinations yield an empty consistent set; if the slabs are 
disambiguated, exactly one combination yields a non-empty 
set, which is then the consistent set S. This technique has 
complexity exponential in m (assuming non-trivial binning). 
The recent manuscript (46] provides bounds on reconstruction 
error for consistent estimation with binned quantization; it 
does not address algorithms for reconstruction. 

This paper provides a tractable and effective method for 
reconstruction from a quantized linear expansion with non- 
regular quantizers. To the best of our knowledge, this is the 
first such method. 

C. Undercomplete Expansions 

Maintaining the quantized measurement model let us 
turn to the case of rn < n. We now call Q(Ax) a quantized 
undercomplete expansion of x. 

Since the rank of A is less than n, A is a many-to-one 
mapping. Thus, even without quantization, one cannot recover 
X from Ax. Rather, Ax specifies a proper subspace of M" 
containing x; when A is in general position, the subspace is 
of dimension n — m. Quantization increases the ambiguity in 
the value of x, yielding consist sets similar to those depicted in 
Figures [Tf a) and|2ja). However, as described in Section Hl-BI 
knowledge that x is sparse or approximately sparse could be 
exploited to enable accurate estimation of x from Q(Ax). 

For ease of explanation, consider only the case where x is 
known to be fc-sparse with k < m. Let J" C {1, 2, . . . , n} he 
the support (sparsity pattern) of x, with \ J'\ — k. The product 
Ax is equal to Aj-xj-, where xj denotes the restriction 
of the domain of x to and Aj is the m x k submatrix 
of A containing the j7-indexed columns. Assuming A j has 
rank k (i.e., full rank), Q(Ax) = Q{Ajxj) is a quantized 
overcomplete expansion of xj. All discussion of estimation 
of xj from the previous subsections thus applies, assuming 
J' is known. 

The key remaining issue is that Q(Ax) may or may not 
provide enough information to infer J'. In an overcomplete 



representation, most vectors of quantizer outputs cannot occur; 
this redundancy was used to enable binning in Figure |2] and 
it can be used to show that certain subsets J' are inconsistent 
with the sparse signal model. In principle, one may enumerate 
the sets J' of size k and apply a consistent reconstruction 
method for each J. If only one candidate J' yields a non- 
empty consistent set, then J' is determined. This is intractable 
except for small problem sizes because there are (^) candi- 
dates for J'. 

The key concepts are illustrated in Figure |3] To have an 
interpretable diagram with k < m < n, we let (fc, m, n) = 
(1,2,3) and draw the space of unquantized measurements z e 
M^. (This contrasts with Figures [T] and |2] where the space of 
X e is drawn.) The vector x has one of (^) = (i) = 3 
possible supports J'. Thus, z lies in one of 3 subspaces of 
dimension 1, which are depicted by the angled solid lines. 
Scalar quantization of z corresponds to separable partitioning 
of with cell boundaries aligned with coordinate axes, as 
shown with lighter solid lines. 

Only one quantized measurement yi is not adequate to 
specify J', as shown in Figure [3ja) by the fact that a sin- 
gle shaded cell intersects all the subspacesQ Two quantized 
measurements together will usually specify J', as shown in 
Figure [Sjb) by the fact that only one subspace intersects the 
specified square cell; for fixed scalar quantizers, ambiguity 
becomes less likely as k decreases, n increases, m increases, 
or ||x|| increases. Figure |3lc) shows a case where non-regular 
(binned) quantization still allows unambiguous determination 
of J. 

The naive reconstruction method implied by Figure |3lc) is 
to search combinatorially over both J' and the combinations 
in dHJ; this is extremely complex. While the use of binning 
for quantized undercomplete expansions of sparse signals has 
appeared in the literature, first in [45j and later in [,46] , to 
the best of our knowledge this paper is the first to provide a 

'intersections with two subspaces are shown within the range of the 
diagram. 



KAMILOV, GOYAL, AND RANGAN 



7 



(a) 




(C) 



Fig. 3: Visualizing the information present in a quantized undercomplete expansion Q(Ax) of a 1-sparse signal x G R'^ 
when Ax e M^. The depicted 2-dimensional plane represents the vector of measurements z = Ax. Since x is 1-sparse, the 
measurement lies in a union of 1 -dimensional subspaces (the angled solid lines); since x is 3 dimensional, there are three such 
subspaces. (a) Scalar quantization of zi divides the plane of possible values for z into vertical strips. One particular value of 
Ui = qi{zi) does not specify which entry of x is nonzero since the shaded strip intersects all the angled solid lines. For each 
possible support, the value of the nonzero entry is specified to an interval, (b) Scalar quantization of both components of z 
specifies z to a rectangular cell. In most cases, including the one highlighted, the quantized values specify which entry of x 
is nonzero because only one angled solid line intersects the cell. The value of the nonzero entry is specified to an interval, (c) 
In many cases, including the one highlighted, the quantizers can be non-regular (binned) and yet still uniquely specify which 
entry of x is nonzero. 



z s 

— © — ^ 



y 



GAMP 



Fig. 4: Quantized linear measurement model for which GAMP 
estimator is derived. Vector x G K" with an i.i.d. prior is 
estimated from scalar quantized measurements y £ M"'. The 
quantizer input s is the sum of z = Ax S K.™ and an i.i.d. 
Gaussian noise vector w. Including noise variance in the 
model clarifies certain derivations; setting the noise variance 
to zero recovers acquisition model (H). 

tractable and effective reconstruction method. 

IV. Estimation from Quantized Samples: 
Bayesian Formulation 

We now specify more explicitly the class of problems for 
which we derive new estimation algorithms. Generalizing (|5]l, 
let 

y = Q(z + w) where z = Ax, (9) 

as depicted in Figure |4] The input vector x e M" is random 
with i.i.d. entries with prior p.d.f. p^- The linear mixing matrix 
A e R"ixn is random with i.i.d. entries ~ J\f{0,l/m). 
The (pre-quantization) additive noise w e R™ is random 
with i.i.d. entries Wi ~ A/'(0, a^). The quantizer Q is a scalar 
quantizer, and each of its component quantizers qi is identical 
and has K output levels. 

The estimator x is a function of A, y, Q, and a^. We wish 
to minimize the MSB n^^E[||x - x|p]. 

Our primary interest is in the case of cr^ = 0, but allowing 
a nontrivial distribution for w is not only more general but 
also makes the derivations more clear. 



V. Generalized Approximate Message Passing for 
A Quantizer Output Channel 

The acquisition model ^ is suitable for GAMP estimation 
under the conditions in ifTSl after one simple observation: the 
mapping from z to y is a separable probabilistic mapping 
with identical marginals. Specifically, quantized measurement 
Hi indicates Si e q^^{yi), so each component output channel 
can be characterized as 



PyUV I z) 



where 



is the Gaussian function 
1 



(j){t]a, b) 



: exp 



it- 



GAMP can be derived by approximating the updates in (|4|i 
by two scalar parameters each and introducing some first- 
order approximations, as discussed in |15|. Then given the 
estimation functions Fi„, £[„, Di, and D2 described below, for 
each iteration t ~ 0, 1, 2, the GAMP algorithm produces 
estimates x* of the true signal x according to the following 
rules: 

^ A^u* 



1 



A^u* 1 



Di fy,Ax* 



U*-1A2t*,A2?* 



y,Ax*-u*-iAV,AV 



C7% 



(lOa) 

(10b) 

(10c) 
(lOd) 



Note that in ( fTOl ) the notation denotes the element-wise 
product of a matrix with itself, i.e. (A^)^^ = {Aij^. The 
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estimation functions Fin, ^^in, Di, and D2 described below are 
applied to their inputs component-by-component. 

We refer to messages {xj,Tj}j(zv as variable updates and to 
messages {ui,Ti}i^p as measurement updates. The algorithm 
is initialized by setting Xj — Xinit, Tj — finit, and u^^ =0, 
where iinit and finit are the mean and variance of the prior px- 
The nonlinear functions and are the conditional mean 
and variance 

F,^{q,v) = E[x|q], 
fin (g, v) = var (x\q), 

where q = x ^ v with a; ~ Px and v ^ A/'(0, v). Note that 
these functions can easily be evaluated numerically for any 
given values of q and . Similarly, the functions and D2 
can be computed via 

Di{y,z,iy) = ^{Fout{y,z,i^) ~ z) , (11a) 

D2[y,z,iy) ^ -(1 ), (lib) 



V 



where the functions Fout and £out are the conditional mean and 
variance 

Fon,(y,z,z.) = E[z|zegri(y)] , (12a) 
■^^out (y, z, v) EE var (2 I 2 e q~^ (y)) , (12b) 

of the random variable 2 ~ J\f{z,i'). These functions admit 
closed-form expressions in terms of erf (2) = ^ e^* dt. 

VI. State Evolution for GAMP 

The equations (fTOl i are easy to implement, however they 
provide us no insight into the performance of the algorithm. 
The goal of SE equations is to describe the asymptotic 
behavior of GAMP under large random measurement matrices 
A. 

The SE for our setting in Figure |4]is given by the recursion 

1 



Tt+l — £in 



D2 (/3ft, C72 



(13) 



where t > is the iteration number, (3 — n/m is a fixed 
number denoting the measurement ratio, and is the variance 
of the additive white Gaussian noise (AWGN), which is also 
fixed. We initialize the recursion by setting fo — finit, where 
Tinit is the variance of Xj according to the prior px- We define 
the function £[„ as 



fin (ly) = E (q, v)] 



(14) 



where the expectation is taken over the scalar random variable 
q = X + V, with a; ^ Px and v ^ A/'(0, i^). Similarly, the 
function D2 is defined as 



D2 {iy,a^) =E[D2 (y,2,i/ + a2)] 



(15) 



where D2 is given by (11 Ibt and the expectation is taken over 
Py|z and (2, 2) ^ Af{0, Pj(i^)), with the covariance matrix 



p / \ I /3finit /3fmit 

* /3fimt - /3finit - 



(16) 



One of the main results of IfTSi , which is an extension 
of the analysis in flSl, was to demonstrate the convergence 
of the error performance of the GAMP algorithm to the SE 
equations. Specifically, these works consider the case where 
A is an i.i.d. Gaussian matrix, x is i.i.d. with a prior px 
and TO,n — > 00 with n/m ^ (3. Then, under some further 
technical conditions, it is shown that for any fixed iteration 
number t, the empirical joint distribution of the components 
) of the unknown vector x and its estimate x* converges 
to a simple scalar equivalent model parameterized by the 
outputs of the SE equations. From the scalar equivalent model, 
one can compute any asymptotic componentwise performance 
metric. It can be shown, in particular, that the asymptotic MSE 
is given simply by ff. That is. 



Tt = lim 

n— foo Jl 



1 " 1 

-Y,\xj-x]\^ = lim - 



(17) 



Thus, ft can be used as a metric for the design and analysis 
of the quantizer, although other non-squared error distortions 
could also be considered. Details are provided in [ITSI . 

The analysis in fW\ and (15] are for large i.i.d. Gaussian 
matrices. For certain large sparse random matrices, results in 
|25| and lfT6l show that the same SE equation holds and, 
in fact, additionally provide testable conditions under which 
GAMP is provably optimal. Specifically, it is shown that the 
SE recursion in (fTST l always admits at least one fixed point. 
As i 00 the recursion decreases monotonically to its largest 
fixed point and, if the SE admits a unique fixed point, then 
GAMP is asymptotically mean-square optimal. 

Thus, despite the fact that the prior on x may be non- 
Gaussian and the quantizer function Q( ) is nonlinear, one 
can precisely characterize the exact asymptotic behavior of 
GAMP at least for large random transforms. 

VII. Quantizer Optimization 

Ordinarily, quantizer designs depend on the distribution 
of the quantizer input, with an implicit aim of minimizing 
the MSE between the quantizer input and output. Often, 
only uniform quantizers are considered, in which case the 
"design" is to choose the loading factor of the quantizer When 
quantized data is used as an input to a nonlinear function, 
overall system performance may be improved by adjusting 
the quantizer designs appropriately (24]. In the present setting, 
conventional quantizer design minimizes m^^E[||z — Q(z)||^], 
but minimizing n^^E[||x — x|p] is desired instead. 

The SE description of GAMP performance facilitates the 
desired optimization. By modeling the quantizer as part of the 
channel and working out the resulting equations for GAMP 
and SE, we can make use of the convergence result (flTi to 
recast our optimization problem to 



Q^^ = arg min \ lim Tt > , 



(18) 



where minimization is done over all if-level regular scalar 
quantizers. Based on dTTI l. the optimization is equivalent to 
finding the quantizer that minimizes the asymptotic MSE. In 
the optimization ( fTsT i, we have considered the limit in the 
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iterations, t — J- cx). One can also consider the optimization with 
a finite t, although our simulations exhibit close to the limiting 
performance with a relatively small number of iterations. 

It is important to note that the SE recursion behaves well 
under quantizer optimization. This is due to the fact that SE 
is independent of actual output levels and small changes in 
the quantizer boundaries result in only minor change in the 
recursion (see (I12bt ). Although closed-form expressions for 
the derivatives of ft for large i's are difficult to obtain, we 
can approximate them by using finite difference methods. 
Finally, the recursion itself is fast to evaluate, which makes 
the scheme in ( fTSl l practically realizable under standard opti- 
mization methods. 

VIII. Experimental Results 

A. Overcomplete Expansions 

Consider overcomplete expansion of x as discussed in Sec- 
tion |IIPA] We generate the signal x with i.i.d. elements from 
the standard Gaussian distribution Xj ^ J^{0, 1). We form 
the measurement matrix A from i.i.d. zero-mean Gaussian 
random variables. To concentrate on the degradation due to 
quantization we assume noiseless measurement model (|5]i; i.e., 
CT^ ^ in (H. 

Figure |5] presents squared-error performance of three esti- 
mation algorithms while varying the oversampling ratio m/n 
and holding n = 100. To generate the plot we considered 
estimation from measurements discretized by a 16-level reg- 
ular uniform quantizer We set the granular region of the 
quantizer to [— Sct^jSct^], where cr^ — n/m is the variance 
of the measurements. For each value of m/n, 200 random 
realizations of the problem were generated; the curves show 
the median-squared error performance over these 200 Monte 
Carlo trials. We compare error performance of GAMP against 
two other common reconstruction methods: linear MMSE and 
maximum a posteriori probability (MAP). The MAP estimator 
was implemented using quadratic programming (QP). 

The MAP estimation is type of consistent reconstruction 
method proposed in |T|-fT3l; since the prior is a decreasing 
function of ||x||, the MAP estimate x is the vector consistent 
with Q(Ax) of minimum Euclidean norm. In the earlier 
works, it is argued that consistent reconstruction methods 
offer improved performance over linear estimation, particu- 
larly at high oversampling factors. We see in Figure |5] that 
MAP estimation does indeed outperform linear MMSE at 
high oversampling. However, GAMP offers significantly better 
performance than both LMMSE and MAP, with more than 
5 dB improvement for many values of m/n. In particular, this 
reinforces that MAP is suboptimal because it finds a comer 
of the consistent set, rather than the centroid. Moreover, the 
GAMP method is actually computationally simpler than MAP, 
which requires the solution to a quadratic program. 

With Figure |6] we turn to a comparison among quantizers, 
all with GAMP reconstruction, n = 100, m = 200, and x and 
A distributed as above. To demonstrate the improvement in 
rate-distortion performance that is possible with non-regular 
quantizers, we consider simple uniform modulo quantizers 

z 
-A- 



■e - Linear MMSE 

■♦-MAP 

-••-GAMP 




10" 



10^ 

Oversampling (m/n) 



10-^ 



Fig. 5: Performance comparison for oversampled observation 
of a jointly Gaussian signal vector (no sparsity). GAMP 
outperforms linear MMSE and MAP estimators. 



-♦-Lloyd 

Regular 
—&— Binned 




1.5 



2 2.5 3 

Rate (bits/measurement) 



3.5 



Q(^) 



mod iV, 



(19) 



Fig. 6: Performance comparison of GAMP with optimal uni- 
form quantizers under Gaussian prior for regular and binned 
quantizers. 



where A is the size of the quantization cells. These quantizers 
map the entire real line M to the set {0, 1, . . . , — 1} in a 
periodic fashion. 

We compare three types of quantizers: those optimized 
for MSE of the measurements {not the overall reconstruction 
MSE) using Lloyd's algorithm |26 1, regular uniform quantizers 
with loading factors optimized for reconstruction MSE using 
SE analysis, and (non-regular) uniform modulo quantizers with 
A optimized for reconstruction MSE using SE analysis. The 
last two quantizers were obtained by solving (fT¥i via the 
standard SQP method found in MATLAB. The uniform mod- 
ulo quantizer achieves the best rate-distortion performance, 
while the performance of the quantizer designed with Lloyd's 
algorithm is comparatively poor The stark non-optimality of 
the latter is due to the fact that it optimizes the MSE only 
between quantizer inputs and outputs, ignoring the nonlinear 
estimation algorithm following the quantizer. 
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It is important to point out that, without methods such as 
GAMP, estimation with a modulo quantizer such as ( fT9] l is not 
even computationally possible in works such as |5|-[13|, since 
the consistent set is non-convex and consists of a disjoint union 
of convex sets. Beyond the performance improvements, we 
believe that GAMP provides the first computationally-tractable 
and systematic method for such non-convex quantization re- 
construction problems. 

B. Compressive Sensing with Quantized Measurements 

We next consider estimation of an n-dimensional sparse 
signal X from m < n random measurements — a problem con- 
sidered in quantized compressed sensing [2|-|4|. We assume 
that the signal x is generated with i.i.d. elements from the 
Gauss-Bernoulli distribution 



AA(0,l/p), 
0, 



with probability p; 
with probability 1 — p, 



(20) 



where p is the sparsity ratio that represents the average fraction 
of nonzero components of x. In the following experiments we 
assume p = 1/32. Similarly to the overcomplete case, we 
form the measurement matrix A from i.i.d. Gaussian random 
variables and we assume no additive noise (cr^ = in 

Figure |2] compares MSB performance of GAMP with three 
other standard reconstruction methods. In particular, we con- 
sider linear MMSE and the Basis Pursuit DeNoise (BPDN) 
program iBTl 



arg mm 

xeR" 



|x||i s.t. ||y - Ax||p < e, 



where p — 2 and e G K+ is the parameter representing 
the noise power In the same figure, we additionally plot the 
error performance of the Basis Pursuit DeQuantizer (BPDQ) of 
moment p, proposed in JS), which solves the problem above for 
p > 2. It has been argued in f3 1 that BPDQ offers better error 
performance compared to the standard BPDN as the number 
of samples m increases with respect to the sparsity k of the 
signal X. 

We obtain the curves by varying the ratio m/n and holding 
n — 1024. We perform estimation from measurements ob- 
tained from a 16-level regular uniform quantizer with granular 
region of length 2|| Ax||oo centered at the origin. 

The figure plots the median of the squared error from 1000 
Monte Carlo trials for each value of m/n. For basis pursuit 
methods we optimize the parameter e for the best squared error 
performance; in practice this oracle-aided performance would 
not be achieved. The top curve (worst performance) is for 
linear MMSE estimation; and middle curves are for the basis 
pursuit estimators BPDN and BPDQ with moment p = 4. As 
expected, BPDQ achieves a notable 2 dB reduction in MSB 
compared to BPDN for high values of to, however GAMP 
significantly outperforms both methods over the whole range 
of m/n. 

In Figure [8] we compare the performance of GAMP under 
three quantizers consider before: those optimized for MSB 
of the measurements using Lloyd's algorithm, and regular 
and non-regular quantizers optimized for reconstruction MSB 
using SB analysis. We assume the same x and A distributions 
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Fig. 7: Performance comparison of GAMP with LMMSB, 
BPDN, and BPDQ (with moment p = 4) for estimation from 
compressive measurements. 
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Fig. 8: Performance comparison of GAMP with optimal uni- 
form quantizers under Gauss-Bernoulli prior for regular and 
binned quantizers. 



as above. We plot MSB of the reconstruction against the rate 
measured in bits per component of x. For each rate and for 
each quantizer, we vary the ratio m/n for the best possible 
performance. We see that, in comparison to regular quantizers, 
binned quantizers with GAMP estimation achieve much lower 
distortions for the same rates. This indicates that binning 
can be an effective strategy to favorably shift rate-distortion 
performance of the estimation. 

IX. Conclusions 

We have presented generalized approximate message pass- 
ing as an effective and efficient algorithm for estimation from 
quantized linear measurements. The GAMP methodology is 
general, allowing essentially arbitrary priors and quantization 
functions. In particular, GAMP is the first tractable and 
effective method for high-dimensional estimation problems 
involving non-regular scalar quantization. In addition, the 
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algorithm is computationally extremely simple and, in the 
case of large random transforms, admits a precise performance 
characterization using a state evolution analysis. 

The problem formulation is Bayesian, with an i.i.d. prior 
over the components of the signal of interest x; the prior 
may or may not induce sparsity of x. Also, the number of 
measurements may be more or less than the dimension of x, 
and the quantizers applied to the linear measurements may 
be regular or not. Experiments show significant performance 
improvement over traditional reconstruction schemes, some of 
which have higher computational complexity. Moreover, using 
extensions of GAMP such as hybrid approximate message 
passing 1481 . one may also in the future be able to consider 
quantization of more general classes of signals described by 
general graphical models. MATLAB code for experiments 
with GAMP is available online B9ll . 

Despite the improvements demonstrated here, we are not 
advocating quantized linear expansions as a compression 
technique — for the oversampled case or the undersampled 
sparse case; thus, comparisons to rate-distortion bounds would 
obscure the contribution. For regular quantizers and some fixed 
oversampling (3 = m/n > 1, the MSE decay with increasing 
rate is ^ 2"^^/*^, worse than the ~ 2~^^ distortion-rate 
bound. For a discussion of achieving exponential decay of 
MSE with increasing oversampling, while the quantization 
step size is held constant, see fSOl. For the undersampled 
sparse case, l,38J discusses the difficulty of recovering the 
support from quantized samples and the consequent difficulty 
of obtaining near-optimal rate-distortion performance. Perfor- 
mance loss rooted in the use of a random transformation A is 
discussed in ifSTI . 
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